- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Europe > Bulgaria (0.04)
- Europe > Belgium > Brussels-Capital Region > Brussels (0.04)
- (2 more...)
Improving Language Plasticity via Pretraining with Active Forgetting
Pretrained language models (PLMs) are today the primary model for natural language processing. Despite their impressive downstream performance, it can be difficult to apply PLMs to new languages, a barrier to making their capabilities universally accessible. While prior work has shown it possible to address this issue by learning a new embedding layer for the new language, doing so is both data and compute inefficient. We propose to use an active forgetting mechanism during pretraining as a simple way of creating PLMs that can quickly adapt to new languages. Concretely, by resetting the embedding layer every K updates during pretraining, we encourage the PLM to improve its ability to learn new embeddings within a limited number of updates, similar to a meta-learning effect. Experiments with RoBERTa show that models pretrained with our forgetting mechanism not only demonstrate faster convergence during language adaptation but also outperform standard ones in a low-data regime, particularly for languages that are distant from English.
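To make the mechanism concrete, here is a minimal sketch of what resetting the embedding layer every K updates could look like in a standard training loop, assuming a Hugging Face-style model that exposes get_input_embeddings(); the interval K and the init scale are illustrative choices, not the paper's exact settings.

```python
import torch.nn as nn

K = 1000  # hypothetical reset interval (updates between embedding resets)

def reset_embeddings(model) -> None:
    """Re-initialize only the token-embedding layer; the body keeps its weights."""
    emb = model.get_input_embeddings()        # HF-style accessor (assumed)
    nn.init.normal_(emb.weight, mean=0.0, std=0.02)

def train_with_active_forgetting(model, optimizer, data_loader, total_steps):
    step = 0
    for batch in data_loader:
        loss = model(**batch).loss            # standard LM training step
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
        step += 1
        if step % K == 0:                     # every K updates: "forget"
            reset_embeddings(model)
        if step >= total_steps:
            break
```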
Persian-Phi: Efficient Cross-Lingual Adaptation of Compact LLMs via Curriculum Learning
Akhlaghi, Amir Mohammad, Shabani, Amirhossein, Abdolmaleki, Mostafa, Kheradpisheh, Saeed Reza
The democratization of AI is currently hindered by the immense computational costs required to train Large Language Models (LLMs) for low-resource languages. This paper presents Persian-Phi, a 3.8B-parameter model that challenges the assumption that robust multilingual capabilities require massive model sizes or multilingual baselines. We demonstrate how Microsoft Phi-3 Mini -- originally a monolingual English model -- can be effectively adapted to Persian through a novel, resource-efficient curriculum learning pipeline. Our approach employs a unique "warm-up" stage using bilingual narratives (Tiny Stories) to align embeddings prior to heavy training, followed by continual pretraining and instruction tuning via Parameter-Efficient Fine-Tuning (PEFT). Despite its compact size, Persian-Phi achieves competitive results on the Open Persian LLM Leaderboard on Hugging Face. Our findings provide a validated, scalable framework for extending the reach of state-of-the-art LLMs to underrepresented languages with minimal hardware resources. The Persian-Phi model is publicly available at https://huggingface.co/amirakhlaghiqqq/PersianPhi.
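For readers unfamiliar with PEFT, the sketch below shows one common way such an adaptation stage can be wired up with LoRA adapters on a Phi-3 Mini checkpoint; the rank, alpha, and target modules are assumptions for illustration and are not claimed to match Persian-Phi's actual recipe.

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Load the (English) Phi-3 Mini base model; checkpoint ID is an assumption.
base = AutoModelForCausalLM.from_pretrained("microsoft/Phi-3-mini-4k-instruct")

# LoRA adapters on the attention projections; rank/alpha/targets are illustrative.
config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["qkv_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, config)
model.print_trainable_parameters()  # only a small fraction of weights are trainable
```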
- Asia > Middle East > Iran > Tehran Province > Tehran (0.04)
- North America > United States > Virginia (0.04)
- Asia > Middle East > Jordan (0.04)
- Asia > Japan > Honshū > Chūbu > Toyama Prefecture > Toyama (0.04)
FastPOS: Language-Agnostic Scalable POS Tagging Framework for Low-Resource Use Cases
Kafi, Md Abdullah Al, Banshal, Sumit Kumar
This study proposes a language-agnostic transformer-based POS tagging framework designed for low-resource languages, using Bangla and Hindi as case studies. With only three lines of framework-specific code, the model was adapted from Bangla to Hindi, demonstrating effective portability with minimal modification. The framework achieves 96.85 percent and 97 percent token-level accuracy across POS categories in Bangla and Hindi while sustaining strong F1 scores despite dataset imbalance and linguistic overlap. A performance discrepancy in a specific POS category underscores ongoing challenges in dataset curation. The strong results stem from the underlying transformer architecture, which can be replaced with limited code adjustments. Its modular and open-source design enables rapid cross-lingual adaptation while reducing model design and tuning overhead, allowing researchers to focus on linguistic preprocessing and dataset refinement, which are essential for advancing NLP in underrepresented languages.
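As a rough illustration of the portability claim, swapping languages in a transformer POS tagger can reduce to changing a checkpoint, label set, and language identifier, as in the hedged sketch below; the checkpoint name, tag count, and variable names are hypothetical, not FastPOS's actual API.

```python
from transformers import AutoTokenizer, AutoModelForTokenClassification

# Hypothetical language switch: these three assignments are the only lines
# that change when moving from Bangla ("bn") to Hindi ("hi").
LANG = "hi"
CHECKPOINT = "xlm-roberta-base"   # assumed multilingual backbone
NUM_TAGS = 17                     # e.g., the Universal POS tag set

tokenizer = AutoTokenizer.from_pretrained(CHECKPOINT)
model = AutoModelForTokenClassification.from_pretrained(
    CHECKPOINT, num_labels=NUM_TAGS
)
# Fine-tuning then proceeds on the POS dataset selected by LANG.
```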
- Europe > France > Provence-Alpes-Côte d'Azur > Bouches-du-Rhône > Marseille (0.04)
- Asia > India > Karnataka (0.04)
- Asia > Bangladesh > Dhaka Division > Dhaka District > Dhaka (0.04)
- Europe > Romania > Sud - Muntenia Development Region > Giurgiu County > Giurgiu (0.04)
- Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
- Europe > France > Provence-Alpes-Côte d'Azur > Bouches-du-Rhône > Marseille (0.04)
- Asia > China > Beijing > Beijing (0.04)
The Bambu Lab A1 is the best multi-color 3D printer for beginners and it's up to $260 off at Amazon
Bambu Lab 3D printers rarely go on sale, but Amazon has every version of the auto-leveling A1 deeply discounted right now. Bambu Lab makes some of our very favorite 3D printers on the market, but they rarely go on sale. Even on big shopping holidays, they tend to hover at or around retail prices. Right now, Amazon has one of the company's most popular (and accessible) models at a huge discount.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Europe > Bulgaria (0.04)
- Europe > Belgium > Brussels-Capital Region > Brussels (0.04)
- (2 more...)
Language-Aware Prompt Tuning for Parameter-Efficient Seamless Language Expansion in Multilingual ASR
Yang, Hongli, Li, Sheng, Huang, Hao, Tuohan, Ayiduosi, Peng, Yizhou
Recent advancements in multilingual automatic speech recognition (ASR) have been driven by large-scale end-to-end models like Whisper. However, challenges such as language interference and expanding to unseen languages (language expansion) without degrading performance persist. This paper addresses these challenges with three contributions: 1) Entire Soft Prompt Tuning (Entire SPT), which applies soft prompts to both the encoder and decoder, enhancing feature extraction and decoding; 2) Language-Aware Prompt Tuning (LAPT), which leverages cross-lingual similarities to encode shared and language-specific features using lightweight prompt matrices; 3) SPT-Whisper, a toolkit that integrates SPT into Whisper and enables efficient continual learning. Experiments across three languages from FLEURS demonstrate that Entire SPT and LAPT outperform Decoder SPT by 5.0% and 16.0% in language expansion tasks, respectively, providing an efficient solution for dynamic, multilingual ASR models with minimal computational overhead.
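As a rough sketch of the soft-prompt idea, the module below prepends learnable prompt vectors to a frozen encoder's input features so that only the prompts receive gradients; the prompt length, dimensions, and wrapper design are illustrative assumptions, not SPT-Whisper's actual implementation.

```python
import torch
import torch.nn as nn

class SoftPromptEncoder(nn.Module):
    """Wrap a frozen encoder; learnable prompts are prepended to its inputs."""

    def __init__(self, encoder: nn.Module, prompt_len: int = 16, d_model: int = 512):
        super().__init__()
        self.encoder = encoder
        for p in self.encoder.parameters():
            p.requires_grad = False                   # freeze the backbone
        self.prompt = nn.Parameter(torch.randn(prompt_len, d_model) * 0.02)

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (batch, time, d_model) input features for the encoder
        prompts = self.prompt.unsqueeze(0).expand(feats.size(0), -1, -1)
        return self.encoder(torch.cat([prompts, feats], dim=1))
```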
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
- Asia > China (0.05)
- Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
- (5 more...)